Motivated by the growing importance of reducing unfairness in ML predictions, Fair-ML researchers have presented an extensive suite of algorithmic 'fairness-enhancing' remedies. Most existing algorithms, however, are agnostic to the sources of the observed unfairness. As a result, the literature currently lacks guiding frameworks to specify conditions under which each algorithmic intervention can potentially alleviate the underpinning cause of unfairness. To close this gap, we scrutinize the underlying biases (e.g., in the training data or design choices) that cause observational unfairness. We present the conceptual idea and a first implementation of a bias-injection sandbox tool to investigate fairness consequences of various biases and assess the effectiveness of algorithmic remedies in the presence of specific types of bias. We call this process the bias(stress)-testing of algorithmic interventions. Unlike existing toolkits, ours provides a controlled environment to counterfactually inject biases in the ML pipeline. This stylized setup offers the distinct capability of testing fairness interventions beyond observational data and against an unbiased benchmark. In particular, we can test whether a given remedy can alleviate the injected bias by comparing the predictions resulting after the intervention in the biased setting with true labels in the unbiased regime-that is, before any bias injection. We illustrate the utility of our toolkit via a proof-of-concept case study on synthetic data. Our empirical analysis showcases the type of insights that can be obtained through our simulations.
translated by 谷歌翻译
Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
translated by 谷歌翻译
Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation. Methods: We present Tracker of Anatomy and Tool (TAToo). TAToo jointly tracks the rigid 3D motion of patient skull and surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints on the object level. Results: We validate TAToo on both simulation data, where ground truth motion is available, as well as on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1{\deg}. We further illustrate how TAToo may be used in a surgical navigation setting. Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo directly predicts the motion from surgical videos, without the need of any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning of our depth network to reach a 1 mm clinical accuracy goal desired for surgical applications in the skull base.
translated by 谷歌翻译
Language models have been shown to be very effective in predicting brain recordings of subjects experiencing complex language stimuli. For a deeper understanding of this alignment, it is important to understand the alignment between the detailed processing of linguistic information by the human brain versus language models. In NLP, linguistic probing tasks have revealed a hierarchy of information processing in neural language models that progresses from simple to complex with an increase in depth. On the other hand, in neuroscience, the strongest alignment with high-level language brain regions has consistently been observed in the middle layers. These findings leave an open question as to what linguistic information actually underlies the observed alignment between brains and language models. We investigate this question via a direct approach, in which we eliminate information related to specific linguistic properties in the language model representations and observe how this intervention affects the alignment with fMRI brain recordings obtained while participants listened to a story. We investigate a range of linguistic properties (surface, syntactic and semantic) and find that the elimination of each one results in a significant decrease in brain alignment across all layers of a language model. These findings provide direct evidence for the role of specific linguistic information in the alignment between brain and language models, and opens new avenues for mapping the joint information processing in both systems.
translated by 谷歌翻译
This paper aims to provide an unsupervised modelling approach that allows for a more flexible representation of text embeddings. It jointly encodes the words and the paragraphs as individual matrices of arbitrary column dimension with unit Frobenius norm. The representation is also linguistically motivated with the introduction of a novel similarity metric. The proposed modelling and the novel similarity metric exploits the matrix structure of embeddings. We then go on to show that the same matrices can be reshaped into vectors of unit norm and transform our problem into an optimization problem over the spherical manifold. We exploit manifold optimization to efficiently train the matrix embeddings. We also quantitatively verify the quality of our text embeddings by showing that they demonstrate improved results in document classification, document clustering, and semantic textual similarity benchmark tests.
translated by 谷歌翻译
Often questions provided to open-domain question answering systems are ambiguous. Traditional QA systems that provide a single answer are incapable of answering ambiguous questions since the question may be interpreted in several ways and may have multiple distinct answers. In this paper, we address multi-answer retrieval which entails retrieving passages that can capture majority of the diverse answers to the question. We propose a re-ranking based approach using Determinantal point processes utilizing BERT as kernels. Our method jointly considers query-passage relevance and passage-passage correlation to retrieve passages that are both query-relevant and diverse. Results demonstrate that our re-ranking technique outperforms state-of-the-art method on the AmbigQA dataset.
translated by 谷歌翻译
多种业务场景需要从结构化输入数据中自动生成描述性的人类可读文本。因此,已经开发了针对各种下游任务的事实到文本的系统主要是由于相关数据集的高可用性。直到最近,提出了跨语言事实与文本(XF2T)的问题,该问题是针对多种语言的生成,以及一个数据集,Xalign的八种语言。但是,实际上XF2T生成问题没有严格的工作。我们使用另外四种语言的注释数据扩展了Xalign数据集:旁遮普语,马拉雅拉姆语,阿萨姆语和Oriya。我们在扩展的多语言数据集上使用基于变压器的流行文本生成模型进行了广泛的研究,我们称之为Xalignv2。此外,我们研究了不同文本生成策略的性能:预处理,事实感知的嵌入和结构意识的输入编码的多种变化。我们的广泛实验表明,使用具有结构意识的输入编码的事实感知的嵌入式的多语言MT5模型可以平均在十二种语言中获得最佳结果。我们将代码,数据集和模型公开可用,并希望这将有助于进一步在此关键领域进行进一步的研究。
translated by 谷歌翻译
在线新闻和信息来源是方便且可访问的方法来了解当前问题。例如,超过3亿人在全球Twitter上参与帖子,这提供了传播误导信息的可能性。在许多情况下,由于虚假新闻,已经犯了暴力犯罪。这项研究介绍了Covidmis20数据集(Covid-19误导2020数据集),该数据集由2月至2020年7月收集的1,375,592条推文组成。Covidmis20可以自动更新以获取最新新闻,并在以下网址公开,网址为:HTTPPS://GITHUB.COM./github.com./github.com。/一切guy/covidmis20。这项研究是使用BI-LSTM深度学习和合奏CNN+BI-GRU进行假新闻检测进行的。结果表明,测试精度分别为92.23%和90.56%,集合CNN+BI-GRU模型始终提供了比BI-LSTM模型更高的精度。
translated by 谷歌翻译
我们提出了一种使用神经网络反馈控制器对封闭环控制系统进行状态空间探索的新技术。我们的方法涉及近似闭环动力学轨迹的灵敏度。使用这样的近似器和系统模拟器,我们提出了一种指导状态空间探索方法,该方法可以生成在指定时间访问目标状态附近的轨迹。我们提出了一个理论框架,该框架确定我们的方法将产生一系列轨迹,该轨迹将到达目标状态的合适邻居。我们通过不同配置的神经网络反馈控制器对各种系统进行彻底评估。我们的表现优于早期的状态空间探索技术,并在质量(解释性)和性能(收敛速度)方面取得了显着改善。最后,我们采用算法来伪造一类时间逻辑规范,评估其针对最先进的伪造工具的绩效,并表现出其在补充现有的伪造算法方面的潜力。
translated by 谷歌翻译
文本分类是许多自然语言处理任务的组成部分,例如讽刺检测,情感分析和更多此类应用。许多电子商务网站,社交媒体/娱乐平台都使用此类模型来增强用户体验以产生流量,从而在其平台上获得收入。在本文中,我们将在Sharechat提供的印度视频共享社交网络服务MOJ上介绍了多语言滥用评论标识问题。该问题涉及在MOJ平台上的视频上使用13种区域性指示语言(例如印地语,泰卢固语,卡纳达语等)中检测滥用评论的问题。我们的解决方案利用了新颖的Muboost,这是印度语言模型(Muril)模型的Catboost分类器模型和多语言表示的合奏,以在指示文本分类任务上产生SOTA性能。我们能够在测试数据上达到平均F1分数为89.286,这比基线Muril模型的改进,F1分数为87.48。
translated by 谷歌翻译